Arabic Text Classification Framework Based on Latent Dirichlet Allocation
Similar resources
Arabic Text Classification Framework Based on Latent Dirichlet Allocation
Current research usually adopts the Vector Space Model to represent documents in text classification applications. In this approach, a document is coded as a vector of words or n-grams. These features cannot capture semantic or textual content, which results in a huge feature space and semantic loss. The proposed model in this work adopts “topics” sampled by an LDA model as text features. It effectively avoids t...
Semi-supervised Latent Dirichlet Allocation for Multi-label Text Classification
This paper proposes a semi-supervised latent Dirichlet allocation (ssLDA) method, which differs from existing supervised topic models for multi-label classification mainly in two aspects. First, both labeled and unlabeled data are used in ssLDA to train a model, which is very important for reducing the cost of manual labeling, especially when obtaining a fully labeled dataset is...
Multi-label Classification Algorithm Based on Latent Dirichlet Allocation Model
The Vector Space Model (VSM) is used frequently in Text Classification (TC). However, it usually produces a high-dimensional feature space, which leads to a huge cost in computation and storage. Recently, statistical topic models have come to play an important role in the fields of Information Retrieval (IR), TC and Document Clustering. In this chapter, we try to use a kind of statistical model, Latent Dirichlet Allo...
Aurora Image Classification Based on Multi-Feature Latent Dirichlet Allocation
Due to the rich physical meaning of aurora morphology, the classification of aurora images is an important task for polar scientific expeditions. However, the traditional classification methods do not make full use of the different features of aurora images, and the dimension of the description features is usually so high that it reduces the efficiency. In this paper, through combining multiple...
Similarity Measures Based on Latent Dirichlet Allocation
In this paper we present the results of our investigation of semantic similarity measures at the word and sentence level, based on two fully automated approaches to deriving meaning from large corpora: Latent Dirichlet Allocation, a probabilistic approach, and Latent Semantic Analysis, an algebraic approach. The focus is on similarity measures based on Latent Dirichlet Allocation, due to its novelty ...
Journal
Journal title: Journal of Computing and Information Technology
Year: 2012
ISSN: 1330-1136
DOI: 10.2498/cit.1001770